Streaming, Distributed Variational Inference for Bayesian Nonparametrics
Abstract
This paper presents a methodology for creating streaming, distributed inference algorithms for Bayesian nonparametric (BNP) models. In the proposed framework, processing nodes receive a sequence of data minibatches, compute a variational posterior for each, and make asynchronous streaming updates to a central model. In contrast to previous algorithms, the proposed framework is truly streaming, distributed, asynchronous, learning-rate-free, and truncation-free. The key challenge in developing the framework, arising from the fact that BNP models do not impose an inherent ordering on their components, is finding the correspondence between minibatch and central BNP posterior components before performing each update. To address this, the paper develops a combinatorial optimization problem over component correspondences, and provides an efficient solution technique. The paper concludes with an application of the methodology to the DP mixture model, with experimental results demonstrating its practical scalability and performance.
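The component-correspondence step described above can be illustrated with a small sketch. It is not the paper's optimization problem or solution technique, neither of which is given in this abstract. It assumes one-dimensional components summarized by a (count, sum) pair, a squared-distance matching cost, and a fixed penalty `new_cost` for opening a new component. All of these, along with the function names `match_components` and `merge`, are hypothetical choices made for illustration, and the brute-force search is only practical for a handful of components.

```python
from itertools import product


def match_components(central, batch, new_cost=4.0):
    """Map each minibatch component to a central index, or None for a new one.

    Components are (count, sum) pairs; the cost of a match is the squared
    distance between component means (an assumption for this sketch).
    """
    def mean(component):
        count, total = component
        return total / count

    options = list(range(len(central))) + [None]
    best, best_cost = None, float("inf")
    # Brute force over all assignments; fine only for very small problems.
    for assign in product(options, repeat=len(batch)):
        used = [k for k in assign if k is not None]
        if len(used) != len(set(used)):
            continue  # each central component may be matched at most once
        cost = sum(
            new_cost if k is None else (mean(batch[j]) - mean(central[k])) ** 2
            for j, k in enumerate(assign)
        )
        if cost < best_cost:
            best, best_cost = assign, cost
    return best


def merge(central, batch, assign):
    """Add matched minibatch statistics into the central model; append new ones."""
    merged = [list(c) for c in central]
    for (count, total), k in zip(batch, assign):
        if k is None:
            merged.append([count, total])
        else:
            merged[k][0] += count
            merged[k][1] += total
    return merged


# Central model with component means 0.0 and 5.0.
central = [(10, 0.0), (10, 50.0)]
# Minibatch components arrive in arbitrary order, with means 5.2, -0.1, 20.0.
batch = [(4, 20.8), (4, -0.4), (2, 40.0)]

assign = match_components(central, batch)
model = merge(central, batch, assign)
print(assign)
print(model)
```

In this toy run the first two minibatch components are matched to the existing central components in swapped order, and the third, which is far from both, is added as a new component. This mirrors the abstract's point that components carry no inherent ordering and that the central model must be free to grow.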
Similar resources
Streaming Variational Bayes
Overview
• Large, streaming data sets are increasingly the norm
• Inference for Big Data has generally been non-Bayesian
• Advantages of Bayes: complex models, coherent treatment of uncertainty, etc.
We deliver:
• SDA-Bayes, a framework for Streaming, Distributed, Asynchronous Bayesian inference
• Experiments demonstrating streaming topic discovery with comparable predictive performance to non-...
Streaming Variational Inference for Dirichlet Process Mixtures
Bayesian nonparametric models are theoretically well suited to learning from streaming data because their complexity adapts to the volume of observed data. However, most existing variational inference algorithms are not applicable to streaming applications, since they require truncation of the variational distributions. In this paper, we present two truncation-free variational algorithms, one for mix...
Nonparametric Max-Margin Matrix Factorization for Collaborative Prediction
We present a probabilistic formulation of max-margin matrix factorization and build accordingly a nonparametric Bayesian model which automatically resolves the unknown number of latent factors. Our work demonstrates a successful example that integrates Bayesian nonparametrics and max-margin learning, which are conventionally two separate paradigms and enjoy complementary advantages. We develop ...
Streaming Variational Inference for Bayesian Nonparametric Mixture Models
In theory, Bayesian nonparametric (BNP) models are well suited to streaming data scenarios due to their ability to adapt model complexity with the observed data. Unfortunately, such benefits have not been fully realized in practice; existing inference algorithms are either not applicable to streaming applications or not extensible to BNP models. For the special case of Dirichlet processes, stre...
Truncation-free Hybrid Inference for DPMM
Dirichlet process mixture models (DPMM) are a cornerstone of Bayesian nonparametrics. While these models free the user from choosing the number of components a priori, computationally attractive variational inference often reintroduces the need to do so via a truncation of the variational distribution. In this paper we present a truncation-free hybrid inference method for DPMM, combining the advantages of sam...